The security of artificial intelligence (AI) is an important research area towards safe, reliable, and trustworthy AI systems. To accelerate research on AI security, the Artificial Intelligence Security Competition (AISC) was organized by the Zhongguancun Laboratory, China Industrial Control Systems Cyber Emergency Response Team, Institute for Artificial Intelligence, Tsinghua University, and RealAI as part of the Zhongguancun International Frontier Technology Innovation Competition (https://www.zgc-aisc.com/en). The competition consists of three tracks: the Deepfake Security Competition, the Autonomous Driving Security Competition, and the Face Recognition Security Competition. This report introduces the competition rules of these three tracks and the solutions of the top-ranking teams in each track.
In this work, we focus on the problem of safe policy transfer in reinforcement learning: we seek to leverage existing policies when learning a new task with specified constraints. This problem is important for safety-critical applications where interactions are costly and unconstrained policies can lead to undesirable or dangerous outcomes, e.g., with physical robots that interact with humans. We propose a Constrained Markov Decision Process (CMDP) formulation that simultaneously enables the transfer of policies and adherence to safety constraints. Our formulation cleanly separates task goals from safety considerations and permits the specification of a wide variety of constraints. Our approach relies on a novel extension of generalized policy improvement to constrained settings via a Lagrangian formulation. We devise a dual optimization algorithm that estimates the optimal dual variable of a target task, thus enabling safe transfer of policies derived from successor features learned on source tasks. Our experiments in simulated domains show that our approach is effective; it visits unsafe states less frequently and outperforms alternative state-of-the-art methods when taking safety constraints into account.
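The Lagrangian treatment of the CMDP described above can be sketched in standard notation (our notation, not necessarily the paper's): with task reward $r$, safety cost $c$, discount $\gamma$, and cost budget $d$, the constrained transfer objective is the saddle-point problem

```latex
\max_{\pi}\ \min_{\lambda \ge 0}\;
\mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]
\;-\; \lambda \left( \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t)\right] - d \right)
```

where the dual variable $\lambda$ trades off task return against constraint violation; the abstract's dual optimization algorithm estimates the optimal $\lambda$ for the target task before transferring the successor-feature policies.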
Real-world text applications often involve composing a wide range of text control operations, such as editing text with respect to an attribute, manipulating keywords and structure, and generating new text with desired attributes. Prior work typically learns or finetunes a language model (LM) to perform individual operations or specific subsets of them. Recent research has studied combining operations in a plug-and-play manner, often with costly search or optimization in the complex sequence space. This paper proposes a new, efficient approach for composable text operations in a compact text latent space. The low dimensionality and differentiability of the text latent vectors allow us to develop an efficient sampler based on ordinary differential equations (ODEs) given arbitrary plug-in operators (e.g., attribute classifiers). By connecting pretrained LMs (e.g., GPT2) to the latent space through efficient adaptation, we then decode the sampled vectors into the desired text sequences. This flexible approach permits diverse control operators (sentiment, tense, formality, keywords, etc.) acquired with any relevant data from different domains. Experiments show that composing these operators within our approach can generate or edit high-quality text, substantially improving over previous methods in both generation quality and efficiency.
Facial expression is an essential factor in conveying human emotional states and intentions. Although remarkable advances have been made on the facial expression recognition (FER) task, challenges remain due to the large variation of expression patterns and unavoidable data uncertainties. In this paper, we propose mid-level representation enhancement (MRE) and graph-embedded uncertainty suppressing (GUS) to address these issues. On one hand, MRE is introduced to prevent expression representation learning from being dominated by a limited number of highly discriminative patterns. On the other hand, GUS is introduced to suppress feature ambiguity in the representation space. The proposed method not only has stronger generalization capability for handling diverse variations of expression patterns, but also greater robustness in capturing expression representations. Experimental evaluation on Aff-Wild2 has verified the effectiveness of the proposed method.
We design simple and optimal policies that ensure safety against heavy-tailed risk in the classical multi-armed bandit problem. Recently, \cite{fan2021deviations} showed that information-theoretically optimized bandit algorithms suffer from serious heavy-tailed risk; that is, the worst-case probability of incurring a linear regret decays only at a polynomial rate of $1/T$, where $T$ is the time horizon. Inspired by their result, we further show that widely used policies, such as the standard Upper Confidence Bound policy and the Thompson sampling policy, also incur heavy-tailed risk; in fact, this heavy-tailed risk exists for all "instance-dependent consistent" policies. To ensure safety against such heavy-tailed risk, for the two-armed bandit setting, we provide a simple policy design that (i) has worst-case optimality for the expected regret at order $\tilde{O}(\sqrt{T})$, and (ii) has a worst-case tail probability of incurring a linear regret that decays at an exponential rate $\exp(-\Omega(\sqrt{T}))$. We further prove that this exponential decay rate of the tail probability is optimal across all policies that have worst-case optimality for the expected regret. Finally, we extend the policy design and analysis to the general setting with an arbitrary number of arms $K$. We provide a detailed tail-probability characterization for any regret threshold under our policy design: the worst-case probability of incurring a regret larger than $x$ is upper bounded by $\exp(-\Omega(x/\sqrt{KT}))$. Numerical experiments are conducted to illustrate the theoretical findings. Our results reveal insights into the incompatibility between consistency and light-tailed risk, while indicating that worst-case optimality on expected regret and light-tailed risk are compatible.
Motivated by emerging applications such as live-streaming e-commerce, promotions, and recommendations, we introduce and solve a general class of non-stationary multi-armed bandit problems with the following two features: (i) the decision maker can pull and collect rewards from at most $K\,(\ge 1)$ arms in each time period; (ii) the expected reward of an arm immediately drops after it is pulled and then non-parametrically recovers as the arm's idle time increases. With the objective of maximizing the expected cumulative reward over $T$ time periods, we design a class of "Purely Periodic Policies" that jointly schedule the times at which each arm is pulled. For the proposed policies, we prove performance guarantees for both the offline and the online problem. For the offline problem, when all model parameters are known, the proposed periodic policy obtains an approximation ratio of $1-\mathcal{O}(1/\sqrt{K})$, which is asymptotically optimal as $K$ grows to infinity. For the online problem, when the model parameters are unknown and need to be learned dynamically, we integrate the offline periodic policy with an upper confidence bound procedure into an online policy. The proposed online policy is proven to approximate the offline benchmark with $\widetilde{\mathcal{O}}(N\sqrt{T})$ regret. Our framework and policy design may shed light on a broader range of offline planning and online learning applications with non-stationary and recovering rewards.
Increasing research interest focuses on sequential recommender systems, which aim to model dynamic sequence representations precisely. However, the loss functions most commonly used in state-of-the-art sequential recommendation models have essential limitations. To name a few: Bayesian Personalized Ranking (BPR) loss suffers from the vanishing gradient problem caused by numerous negative samples and prediction biases; Binary Cross-Entropy (BCE) loss is sensitive to the number of negative samples, and is therefore likely to ignore valuable negative examples and reduce training efficiency; Cross-Entropy (CE) loss only focuses on the last timestamp of the training sequence, which causes low utilization of sequence information and results in inferior user sequence representations. To avoid these limitations, in this paper, we propose to calculate Cumulative Cross-Entropy (CCE) loss over the sequence. CCE is simple and direct, enjoying the virtues of painless deployment, no negative sampling, and effective and efficient training. We conduct extensive experiments on five benchmark datasets to demonstrate the effectiveness and efficiency of CCE. The results show that employing CCE loss on three state-of-the-art models, GRU4Rec, SASRec, and S3-Rec, reaches 125.63%, 69.90%, and 33.24% average improvement in full-ranking NDCG@5, respectively. Using CCE, the performance curve of the models on the test data increases rapidly with wall-clock time, and is superior to that of other loss functions throughout almost the whole process of model training.
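The loss the abstract describes — full-softmax cross-entropy accumulated over every timestamp of the training sequence rather than only the last one — can be sketched as follows. This is a minimal NumPy illustration under our own assumed names and shapes, not the authors' implementation:

```python
import numpy as np

def cumulative_cross_entropy(logits, targets):
    """Cumulative Cross-Entropy (CCE) over a training sequence.

    Unlike plain CE, which scores only the last timestamp, CCE sums the
    full-softmax cross-entropy at every position, so no negative sampling
    is needed. Assumed shapes: logits (T, V) over the item vocabulary,
    targets (T,) holding the ground-truth next item at each step.
    """
    # numerically stable log-softmax over the item vocabulary
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # negative log-likelihood of the target item at every timestamp, summed
    return -log_probs[np.arange(len(targets)), targets].sum()
```

Because every position contributes a full-softmax term, no negative examples are discarded, which is the property the abstract credits for the training-efficiency gains.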
Temporal sentence grounding (TSG) aims to identify the temporal boundary of a specific segment in an untrimmed video given a sentence query. All existing works first use a sparse sampling strategy to extract a fixed number of video frames and then conduct multi-modal interactions with the query sentence for reasoning. However, we argue that these methods overlook two indispensable issues: 1) Boundary bias: the annotated target segment generally refers to two specific frames as the corresponding start and end timestamps. The video downsampling process may lose these two frames and take adjacent, irrelevant frames as the new boundaries. 2) Reasoning bias: such incorrect new boundary frames also introduce bias during frame-query interaction, reducing the generalization ability of the model. To alleviate the above limitations, in this paper we propose a novel Siamese Sampling and Reasoning Network (SSRN) for TSG, which introduces a siamese sampling mechanism to generate additional contextual frames that enrich and refine the new boundaries. Specifically, a reasoning strategy is developed to learn the inter-relationships among these frames and generate soft labels on the boundaries for more accurate frame-query reasoning. This mechanism is also able to supply the absent consecutive visual semantics to the sparsely sampled frames for fine-grained activity understanding. Extensive experiments demonstrate the effectiveness of SSRN on three challenging datasets.
In the scenario of black-box adversarial attack, the target model's parameters are unknown, and the attacker aims to find a successful adversarial perturbation based on query feedback under a query budget. Due to the limited feedback information, existing query-based black-box attack methods often require many queries for attacking each benign example. To reduce query cost, we propose to utilize the feedback information across historical attacks, dubbed example-level adversarial transferability. Specifically, by treating the attack on each benign example as one task, we develop a meta-learning framework by training a meta-generator to produce perturbations conditioned on benign examples. When attacking a new benign example, the meta generator can be quickly fine-tuned based on the feedback information of the new task as well as a few historical attacks to produce effective perturbations. Moreover, since the meta-train procedure consumes many queries to learn a generalizable generator, we utilize model-level adversarial transferability to train the meta-generator on a white-box surrogate model, then transfer it to help the attack against the target model. The proposed framework with the two types of adversarial transferability can be naturally combined with any off-the-shelf query-based attack methods to boost their performance, which is verified by extensive experiments.
Supervised Deep-Learning (DL)-based reconstruction algorithms have shown state-of-the-art results for highly undersampled dynamic Magnetic Resonance Imaging (MRI) reconstruction. However, the requirement for extensive high-quality ground-truth data hinders their application due to the generalization problem. Recently, Implicit Neural Representation (INR) has emerged as a powerful DL-based tool for solving inverse problems by characterizing the attributes of a signal as a continuous function of the corresponding coordinates in an unsupervised manner. In this work, we propose an INR-based method to improve dynamic MRI reconstruction from highly undersampled k-space data, which takes only spatiotemporal coordinates as inputs. Specifically, the proposed INR represents the dynamic MRI images as an implicit function and encodes them into neural networks. The weights of the network are learned from the sparsely acquired (k, t)-space data alone, without external training datasets or prior images. Benefiting from the strong implicit continuity regularization of INR, together with explicit regularization for low-rankness and sparsity, our proposed method outperforms the compared scan-specific methods at various acceleration factors. For example, experiments on retrospective cardiac cine datasets show an improvement of 5.5 ~ 7.1 dB in PSNR at extremely high accelerations (up to 41.6-fold). The high quality and inner continuity of the images provided by INR have great potential to further improve the spatiotemporal resolution of dynamic MRI, without the need for any training data.
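The coordinate-to-intensity mapping at the heart of the INR approach above can be illustrated with a toy forward pass. This is our own minimal NumPy sketch with assumed names (`make_inr`) and architecture; the paper's actual network, its low-rank/sparsity regularizers, and the (k, t)-space data-consistency fitting loop are omitted:

```python
import numpy as np

def make_inr(in_dim=3, hidden=64, num_freqs=16, seed=0):
    """Build a tiny coordinate network mapping (x, y, t) -> complex intensity.

    Random Fourier features followed by a two-layer MLP, a common recipe
    for letting coordinate networks represent high-frequency image content.
    In the paper, the weights would be fitted by enforcing consistency with
    the sparsely sampled (k, t)-space measurements; that optimization is
    omitted here, so this shows only the representation, not reconstruction.
    """
    rng = np.random.default_rng(seed)
    B = rng.normal(scale=10.0, size=(in_dim, num_freqs))   # Fourier frequencies
    W1 = rng.normal(scale=0.1, size=(2 * num_freqs, hidden))
    W2 = rng.normal(scale=0.1, size=(hidden, 2))           # real + imaginary part

    def forward(coords):
        # coords: (N, 3) array of normalized (x, y, t) positions
        proj = 2 * np.pi * coords @ B
        feats = np.concatenate([np.sin(proj), np.cos(proj)], axis=-1)
        h = np.maximum(feats @ W1, 0.0)                    # ReLU hidden layer
        out = h @ W2
        return out[:, 0] + 1j * out[:, 1]                  # complex MR intensity

    return forward
```

Because the whole image series is stored in the network weights as a continuous function, the representation can be queried at arbitrary spatiotemporal coordinates, which is what the abstract means by the potential to improve spatiotemporal resolution.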